Variables and Data Types

Variable Assignment


>>> x = 5

>>> x

5


Calculations With Variables 


>>> x + 2

Sum of two variables

7

>>> x - 2

Subtraction of two variables

3

>>> x * 2

Multiplication of two variables

10

>>> x ** 2

Exponentiation of a variable

25

>>> x % 2

Remainder of a variable

1

>>> x / float(2)

Division of a variable

2.5
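
The float(2) cast above is only needed under Python 2. Assuming Python 3, a minimal sketch of the division operators looks like this (the variable names are illustrative, not from the cheat sheet):

```python
x = 5

true_div = x / 2                    # true division always yields a float in Python 3
floor_div = x // 2                  # floor division rounds toward negative infinity
quotient, remainder = divmod(x, 2)  # floor quotient and remainder in one call

print(true_div, floor_div, quotient, remainder)
```

Under Python 2 the plain `/` on two integers would floor instead, which is why the cheat sheet casts one operand to float.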



Types and Type Conversion 


str()

'5', '3.45', 'True'

Variables to strings

int()

5, 3, 1

Variables to integers

float()

5.0, 1.0

Variables to floats

bool()

True, True, True

Variables to booleans


>>> help(str)
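
The conversions above can be tried on a small list of sample values. This is a sketch only; the list `values` is an assumption chosen to match the outputs shown in the table:

```python
values = [5, 3.45, True]

as_str = [str(v) for v in values]      # text representation of each value
as_int = [int(v) for v in values]      # int() truncates 3.45 to 3; True becomes 1
as_float = [float(v) for v in values]  # True becomes 1.0
as_bool = [bool(v) for v in values]    # any non-zero number is truthy

# Caveat: a non-empty string is always truthy, even the text 'False'.
text_is_truthy = bool('False')

print(as_str, as_int, as_float, as_bool, text_is_truthy)
```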


Strings 


>>> my_string = 'thisStringIsAwesome'
>>> my_string
'thisStringIsAwesome'


String Operations


>>> my_string * 2

'thisStringIsAwesomethisStringIsAwesome'
>>> my_string + 'Innit'

'thisStringIsAwesomeInnit'

>>> 'm' in my_string
True


Lists (also see NumPy Arrays)


>>> a = 'is'

>>> b = 'nice'

>>> my_list = ['my', 'list', a, b]

>>> my_list2 = [[4, 5, 6, 7], [3, 4, 5, 6]]


Selecting List Elements (index starts at 0)


Subset


>>> my_list[1]

Select item at index 1

>>> my_list[-3]

Select 3rd last item

Slice


>>> my_list[1:3]

Select items at index 1 and 2

>>> my_list[1:]

Select items after index 0

>>> my_list[:3]

Select items before index 3

>>> my_list[:]

Copy my_list

Subset Lists of Lists


>>> my_list2[1][0]

>>> my_list2[1][:2]

my_list[list][itemOfList]
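
The difference between a second name for a list and a slice copy is easy to miss. A minimal sketch, reusing the list values from this section (the names `alias`, `shallow`, and `shallow2` are illustrative):

```python
my_list = ['my', 'list', 'is', 'nice']

alias = my_list        # second name for the same list object
shallow = my_list[:]   # new list holding the same items

my_list[0] = 'your'

print(alias[0])    # follows the change
print(shallow[0])  # keeps the old value

# Caveat: the slice copy is shallow, so nested lists are still shared.
my_list2 = [[4, 5, 6, 7], [3, 4, 5, 6]]
shallow2 = my_list2[:]
my_list2[1][0] = 99
shared_inner = shallow2[1][0]
```

For nested lists that must be fully independent, `copy.deepcopy` from the standard library is the usual choice.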


List Operations 



>>> my_list.index(a)

Get the index of an item

>>> my_list.count(a)

Count an item

>>> my_list.append('!')

Append an item at a time

>>> my_list.remove('!')

Remove an item

>>> del(my_list[0:1])

Remove an item

>>> my_list.reverse()

Reverse the list

>>> my_list.extend('!')

Append an item

>>> my_list.pop(-1)

Remove an item

>>> my_list.insert(0, '!')

Insert an item

>>> my_list.sort()

Sort the list
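
append and extend both grow a list but behave differently when given a string or another list. A short sketch under the assumption of a fresh list named `items` (not part of the cheat sheet):

```python
items = ['my', 'list']

items.append('!!')            # adds one element, the whole string
after_append = list(items)

items.extend('!!')            # iterates over the string, adding '!' twice
after_extend = list(items)

items.extend(['is', 'nice'])  # the usual way to add several items at once

removed = items.pop(-1)       # pop returns the element it removes
```

Because extend iterates over its argument, `my_list.extend('!')` only looks like append when the string has a single character.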


String Operations (index starts at 0)


>>> my_string[3]
>>> my_string[4:9]


String Methods


>>> my_string.upper()

String to uppercase

>>> my_string.lower()

String to lowercase

>>> my_string.count('w')

Count String elements

>>> my_string.replace('e', 'i')

Replace String elements

>>> my_string.strip()

Strip whitespaces
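
String methods return new strings and leave the original untouched. A minimal sketch using the string from this section (the padded literal is an illustrative assumption):

```python
my_string = 'thisStringIsAwesome'

upper = my_string.upper()
replaced = my_string.replace('e', 'i')
count_w = my_string.count('w')
stripped = '  padded  '.strip()   # removes leading and trailing whitespace

# str objects are immutable, so my_string still holds its original value.
print(my_string, upper, replaced, count_w, repr(stripped))
```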


Import libraries

>>> import numpy

>>> import numpy as np

Selective import

>>> from math import pi

pandas: Data analysis
scikit-learn: Machine learning
NumPy: Scientific computing
matplotlib: 2D plotting


Install Python

Anaconda: Leading open data science platform powered by Python

Spyder: Free IDE that is included with Anaconda

Jupyter: Create and share documents with live code, visualizations, text, ...


NumPy Arrays (also see Lists)


>>> my_list = [1, 2, 3, 4]

>>> my_array = np.array(my_list)

>>> my_2darray = np.array([[1, 2, 3], [4, 5, 6]])


Selecting NumPy Array Elements (index starts at 0)


Subset


>>> my_array[1]

Select item at index 1

2

Slice


>>> my_array[0:2]

Select items at index 0 and 1

array([1, 2])


Subset 2D NumPy arrays

>>> my_2darray[:, 0]

my_2darray[rows, columns]

array([1, 4])



NumPy Array Operations

>>> my_array > 3

array([False, False, False,  True])

>>> my_array * 2

array([2, 4, 6, 8])

>>> my_array + np.array([5, 6, 7, 8])

array([ 6,  8, 10, 12])


NumPy Array Functions


>>> my_array.shape

Get the dimensions of the array

>>> np.append(my_array, other_array)

Append items to an array

>>> np.insert(my_array, 1, 5)

Insert items in an array

>>> np.delete(my_array, [1])

Delete items in an array

>>> np.mean(my_array)

Mean of the array

>>> np.median(my_array)

Median of the array

>>> np.corrcoef(my_array)

Correlation coefficient

>>> np.std(my_array)

Standard deviation
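
Unlike list methods, the NumPy functions above return new arrays rather than changing the input. A minimal sketch; `other_array` is a hypothetical second array that the cheat sheet never defines:

```python
import numpy as np

my_array = np.array([1, 2, 3, 4])
other_array = np.array([5, 6])               # hypothetical second array

appended = np.append(my_array, other_array)  # [1 2 3 4 5 6]
inserted = np.insert(my_array, 1, 5)         # [1 5 2 3 4]
deleted = np.delete(my_array, [1])           # [1 3 4]

# corrcoef is a NumPy function, not an array method. It needs two series
# (or a 2D array) to produce a meaningful correlation matrix.
corr = np.corrcoef(my_array, my_array * 2)

print(my_array)  # the original array is unchanged
```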
Jupyter Notebook
Saving/Loading Notebooks


File menu:

New Notebook: Create new notebook
Open...: Open an existing notebook
Make a Copy...: Make a copy of the current notebook
Rename...: Rename notebook
Save and Checkpoint: Save current notebook and record checkpoint
Revert to Checkpoint: Revert notebook to a previous checkpoint
Print Preview: Preview of the printed notebook
Download as: Download notebook as
- IPython notebook
- Python
- HTML
- Markdown
- reST
- LaTeX
- PDF
Trusted Notebook
Close and Halt: Close notebook & stop running any scripts


Writing Code And Text


Code and text are encapsulated by 3 basic cell types: markdown cells, code
cells, and raw NBConvert cells.


Edit Cells


Edit menu:

Cut Cells: Cut currently selected cells to clipboard
Copy Cells
Paste Cells Above: Paste cells from clipboard above current cell
Paste Cells Below
Paste Cells & Replace: Paste cells from clipboard on top of current cell
Delete Cells
Undo Delete Cells: Revert "Delete Cells" invocation
Split Cell: Split up a cell from current cursor position
Merge Cell Above: Merge current cell with the one above
Merge Cell Below: Merge current cell with the one below
Move Cell Up: Move current cell up
Move Cell Down: Move current cell down
Edit Notebook Metadata: Adjust metadata underlying the current notebook
Find and Replace: Find and replace in selected cells
Cut Cell Attachments: Remove cell attachments
Copy Cell Attachments: Copy attachments of current cell
Paste Cell Attachments: Paste attachments of current cell
Insert Image: Insert image in selected cells


Insert Cells


Insert menu:

Insert Cell Above: Add new cell above the current one
Insert Cell Below: Add new cell below the current one


Working with Different Programming Languages 


Kernels provide computation and communication with front-end interfaces
like the notebooks. There are three main kernels:

IPython
IRkernel
IJulia

Installing Jupyter Notebook will automatically install the IPython kernel.


Widgets 


Notebook widgets provide the ability to visualize and control changes 
in your data, often as a control like a slider, textbox, etc. 

You can use them to build interactive GUIs for your notebooks or to 
synchronize stateful and stateless information between Python and 
JavaScript. 


Kernel menu:

Interrupt: Interrupt kernel
Restart: Restart kernel
Restart & Clear Output: Restart kernel & clear all output
Restart & Run All: Restart kernel & run all cells
Reconnect: Connect back to a remote notebook
Shutdown
Change kernel: Run other installed kernels


Widgets menu:

Save Notebook with Widgets: Save notebook with interactive widgets
Download Widget State: Download serialized state of all widget models in use
Embed Widgets: Embed current widgets


[Screenshots: notebook toolbar in command mode and in edit mode, showing the menu bar (File, Edit, View, Insert, Cell, Kernel, Widgets, Help) and toolbar buttons numbered 1 to 15]


Remaining Edit menu entries:

Copy Cells: Copy currently selected cells to clipboard
Paste Cells Below: Paste cells from clipboard below current cell
Delete Cells: Delete current cells


Executing Cells


Cell menu:

Run Cells: Run selected cell(s)
Run Cells and Select Below: Run current cells and select the cell below
Run Cells and Insert Below: Run current cells and create a new one below
Run All: Run all cells
Run All Above: Run all cells above the current cell
Run All Below: Run all cells below the current cell
Cell Type: Change the cell type of current cell
Current Outputs: Toggle, toggle scrolling and clear current outputs
All Output: Toggle, toggle scrolling and clear all output


Toolbar buttons (command mode):

1. Save and checkpoint

2. Insert cell below

3. Cut cell

4. Copy cell(s)

5. Paste cell(s) below

6. Move cell up

7. Move cell down

8. Run current cell

9. Interrupt kernel

10. Restart kernel

11. Display characteristics

12. Open command palette

13. Current kernel

14. Kernel status

15. Log out from notebook server


View Cells


View menu:

Toggle Header: Toggle display of Jupyter logo and filename
Toggle Toolbar: Toggle display of toolbar
Toggle Line Numbers: Toggle line numbers in cells
Cell Toolbar: Toggle display of cell action icons:
- None
- Edit metadata
- Raw cell format
- Slideshow
- Attachments
- Tags


Asking For Help


Help menu:

User Interface Tour: Walk through a UI tour
Keyboard Shortcuts: List of built-in keyboard shortcuts
Edit Keyboard Shortcuts: Edit the built-in keyboard shortcuts
Notebook Help: Notebook help topics
Markdown: Description of markdown available in notebook
Jupyter-contrib nbextensions: Information on unofficial Jupyter Notebook extensions
Python: Python help topics
IPython: IPython help topics
NumPy: NumPy help topics
SciPy: SciPy help topics
Matplotlib: Matplotlib help topics
SymPy: SymPy help topics
pandas: Pandas help topics
About: About Jupyter Notebook
NumPy Basics
NumPy


The NumPy library is the core library for scientific computing in 
Python. It provides a high-performance multidimensional array 
object, and tools for working with these arrays. 


Use the following import convention:

>>> import numpy as np


NumPy Arrays


[Figure: a 1D array and a 2D array with its axes labeled; the 2D example holds the values 1.5, 2, 3, 4, 5, 6]

Creating Arrays 


>>> a = np.array([1,2,3])

>>> b = np.array([(1.5,2,3), (4,5,6)], dtype=float)

>>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]],
                 dtype=float)


Initial Placeholders 


>>> np.zeros((3,4))

Create an array of zeros

>>> np.ones((2,3,4), dtype=np.int16)

Create an array of ones

>>> d = np.arange(10,25,5)

Create an array of evenly spaced values (step value)

>>> np.linspace(0,2,9)

Create an array of evenly spaced values (number of samples)

>>> e = np.full((2,2), 7)

Create a constant array

>>> f = np.eye(2)

Create a 2x2 identity matrix

>>> np.random.random((2,2))

Create an array with random values

>>> np.empty((3,2))

Create an empty array


I/O


Saving & Loading On Disk


>>> np.save('my_array', a)

>>> np.savez('array.npz', a, b)
>>> np.load('my_array.npy')


Saving & Loading Text Files


>>> np.loadtxt("myfile.txt")

>>> np.genfromtxt("my_file.csv", delimiter=',')

>>> np.savetxt("myarray.txt", a, delimiter=" ")
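
A round trip through each format shows what comes back from disk. This is a sketch that writes into a temporary directory; the file names mirror the ones above, and the keyword names passed to savez are an assumption added for readability:

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, 'my_array.npy')
    npz_path = os.path.join(tmp, 'array.npz')
    txt_path = os.path.join(tmp, 'myarray.txt')

    np.save(npy_path, a)
    loaded_a = np.load(npy_path)

    np.savez(npz_path, a=a, b=b)       # keyword names become archive keys
    with np.load(npz_path) as archive:
        loaded_b = archive['b']        # read while the archive is still open

    np.savetxt(txt_path, b, delimiter=' ')
    loaded_txt = np.loadtxt(txt_path)  # text files come back as float arrays
```

Passing arrays to savez positionally, as in the cheat sheet, stores them under the keys arr_0, arr_1, and so on.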


Data Types 


>>> np.int64

Signed 64-bit integer types

>>> np.float32

Standard single-precision floating point

>>> np.complex128

Complex numbers represented by two 64-bit floats

>>> np.bool_

Boolean type storing True and False values

>>> np.object_

Python object type

>>> np.bytes_

Fixed-length byte string type

>>> np.str_

Fixed-length Unicode type


Inspecting Your Array 


>>> a.shape

Array dimensions

>>> len(a)

Length of array

>>> b.ndim

Number of array dimensions

>>> e.size

Number of array elements

>>> b.dtype

Data type of array elements

>>> b.dtype.name

Name of data type

>>> b.astype(int)

Convert an array to a different type


Asking For Help


>>> np.info(np.ndarray.dtype)


Array Mathematics 


Arithmetic Operations 


>>> g = a - b

Subtraction

array([[-0.5,  0. ,  0. ],
       [-3. , -3. , -3. ]])

>>> np.subtract(a,b)

Subtraction

>>> b + a

Addition

array([[ 2.5,  4. ,  6. ],
       [ 5. ,  7. ,  9. ]])

>>> np.add(b,a)

Addition

>>> a / b

Division

array([[ 0.66666667,  1.        ,  1.        ],
       [ 0.25      ,  0.4       ,  0.5       ]])

>>> np.divide(a,b)

Division

>>> a * b

Multiplication

array([[  1.5,   4. ,   9. ],
       [  4. ,  10. ,  18. ]])

>>> np.multiply(a,b)

Multiplication

>>> np.exp(b)

Exponentiation

>>> np.sqrt(b)

Square root

>>> np.sin(a)

Element-wise sine

>>> np.cos(b)

Element-wise cosine

>>> np.log(a)

Element-wise natural logarithm

>>> e.dot(f)

Dot product

array([[ 7.,  7.],
       [ 7.,  7.]])


Comparison 


>>> a == b

Element-wise comparison

array([[False,  True,  True],
       [False, False, False]])

>>> a < 2

Element-wise comparison

array([ True, False, False])

>>> np.array_equal(a, b)

Array-wise comparison


Aggregate Functions


>>> a.sum()

Array-wise sum

>>> a.min()

Array-wise minimum value

>>> b.max(axis=0)

Maximum value along axis 0 (one value per column)

>>> b.cumsum(axis=1)

Cumulative sum of the elements

>>> a.mean()

Mean

>>> np.median(b)

Median

>>> np.corrcoef(a)

Correlation coefficient

>>> np.std(b)

Standard deviation


Copying Arrays 


>>> h = a.view()

Create a view of the array with the same data

>>> np.copy(a)

Create a copy of the array

>>> h = a.copy()

Create a deep copy of the array
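
Whether a change to the original shows up elsewhere depends on whether the other array is a view or a copy. A minimal sketch (the names `view` and `copy` are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3])

view = a.view()   # new array object sharing a's data buffer
copy = a.copy()   # independent data

a[0] = 100

view_sees_change = view[0] == 100
copy_sees_change = copy[0] == 100
shares = np.shares_memory(a, view)

print(view_sees_change, copy_sees_change, shares)
```

Basic slices such as `a[0:2]` are also views, so writing into a slice changes the original array.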



Sorting Arrays 


>>> a.sort()

Sort an array

>>> c.sort(axis=0)

Sort the elements of an array's axis


Subsetting, Slicing, Indexing (also see Lists)


Subsetting

>>> a[2]

Select the element at the 2nd index

3

>>> b[1,2]

Select the element at row 1 column 2 (equivalent to b[1][2])

6.0

Slicing

>>> a[0:2]

Select items at index 0 and 1

array([1, 2])

>>> b[0:2,1]

Select items at rows 0 and 1 in column 1

array([ 2.,  5.])

>>> b[:1]

Select all items at row 0 (equivalent to b[0:1, :])

array([[ 1.5,  2. ,  3. ]])

>>> c[1,...]

Same as c[1,:,:]

array([[ 3.,  2.,  1.],
       [ 4.,  5.,  6.]])

>>> a[ : :-1]

Reversed array a

array([3, 2, 1])

Boolean Indexing

>>> a[a<2]

Select elements from a less than 2

array([1])

Fancy Indexing

>>> b[[1, 0, 1, 0], [0, 1, 2, 0]]

Select elements (1,0), (0,1), (1,2) and (0,0)

array([ 4. ,  2. ,  6. ,  1.5])

>>> b[[1, 0, 1, 0]][:, [0, 1, 2, 0]]

Select a subset of the matrix's rows and columns

array([[ 4. ,  5. ,  6. ,  4. ],
       [ 1.5,  2. ,  3. ,  1.5],
       [ 4. ,  5. ,  6. ,  4. ],
       [ 1.5,  2. ,  3. ,  1.5]])
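
The two fancy-indexing forms differ in how the index lists are combined. A sketch using array b from this section; `np.ix_` is shown as one possible single-step alternative to the chained form, not as something the cheat sheet itself uses:

```python
import numpy as np

b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)
rows = [1, 0, 1, 0]
cols = [0, 1, 2, 0]

paired = b[rows, cols]           # elements (1,0), (0,1), (1,2), (0,0)
chained = b[rows][:, cols]       # pick rows first, then columns
crossed = b[np.ix_(rows, cols)]  # same grid in a single indexing step

print(paired.shape, chained.shape, crossed.shape)
```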


Array Manipulation


Transposing Array

>>> i = np.transpose(b)

Permute array dimensions

>>> i.T

Permute array dimensions

Changing Array Shape

>>> b.ravel()

Flatten the array

>>> g.reshape(3,-1)

Reshape, but don't change data

Adding/Removing Elements

>>> h.resize((2,6))

Resize the array in place to shape (2,6)

>>> np.append(h,g)

Append items to an array

>>> np.insert(a, 1, 5)

Insert items in an array

>>> np.delete(a, [1])

Delete items from an array

Combining Arrays

>>> np.concatenate((a,d), axis=0)

Concatenate arrays

array([ 1,  2,  3, 10, 15, 20])

>>> np.vstack((a,b))

Stack arrays vertically (row-wise)

array([[ 1. ,  2. ,  3. ],
       [ 1.5,  2. ,  3. ],
       [ 4. ,  5. ,  6. ]])

>>> np.r_[e,f]

Stack arrays vertically (row-wise)

>>> np.hstack((e,f))

Stack arrays horizontally (column-wise)

array([[ 7.,  7.,  1.,  0.],
       [ 7.,  7.,  0.,  1.]])

>>> np.column_stack((a,d))

Create stacked column-wise arrays

array([[ 1, 10],
       [ 2, 15],
       [ 3, 20]])

>>> np.c_[a,d]

Create stacked column-wise arrays

Splitting Arrays

>>> np.hsplit(a,3)

Split the array horizontally into 3 equal parts

[array([1]), array([2]), array([3])]

>>> np.vsplit(c,2)

Split the array vertically into 2 equal parts

[array([[[ 1.5,  2. ,  3. ],
         [ 4. ,  5. ,  6. ]]]),
 array([[[ 3.,  2.,  1.],
         [ 4.,  5.,  6.]]])]
SciPy - Linear Algebra



SciPy 


The SciPy library is one of the core packages for
scientific computing that provides mathematical
algorithms and convenience functions built on the
NumPy extension of Python.


Interacting With NumPy (also see NumPy)


>>> import numpy as np
>>> a = np.array([1,2,3])

>>> b = np.array([(1+5j,2j,3j), (4j,5j,6j)])

>>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]])


Index Tricks 


>>> np.mgrid[0:5,0:5]

Create a dense meshgrid

>>> np.ogrid[0:2,0:2]

Create an open meshgrid

>>> np.r_[3,[0]*5,-1:1:10j]

Stack arrays vertically (row-wise)

>>> np.c_[b,c]

Create stacked column-wise arrays


Shape Manipulation 


>>> np.transpose(b)

Permute array dimensions

>>> b.flatten()

Flatten the array

>>> np.hstack((b,c))

Stack arrays horizontally (column-wise)

>>> np.vstack((a,b))

Stack arrays vertically (row-wise)

>>> np.hsplit(c,2)

Split the array horizontally at the 2nd index

>>> np.vsplit(c,2)

Split the array vertically at the 2nd index


Polynomials 


>>> from numpy import poly1d
>>> p = poly1d([3,4,5])

Create a polynomial object


Vectorizing Functions 


>>> def myfunc(a):
...     if a < 0:
...         return a*2
...     else:
...         return a/2

>>> np.vectorize(myfunc)

Vectorize functions
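
np.vectorize returns a new callable that applies the scalar function to each element. A minimal sketch with an assumed sample array `values`; the `np.where` line is an alternative added here for comparison and is not part of the cheat sheet:

```python
import numpy as np

def myfunc(a):
    # Scalar function from the text: double negatives, halve the rest.
    if a < 0:
        return a * 2
    return a / 2

values = np.array([-2.0, -1.0, 0.0, 1.0, 4.0])   # illustrative input

vfunc = np.vectorize(myfunc)
via_vectorize = vfunc(values)

# The same rule written with array operations only.
via_where = np.where(values < 0, values * 2, values / 2)

print(via_vectorize, via_where)
```

np.vectorize is a convenience wrapper around a Python loop, so the array-expression form is usually the faster of the two on large inputs.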


Type Handling 


>>> np.real(c)

Return the real part of the array elements

>>> np.imag(c)

Return the imaginary part of the array elements

>>> np.real_if_close(c, tol=1000)

Return a real array if complex parts close to 0

>>> np.float32(np.pi)

Cast object to a data type


Other Useful Functions 


>>> np.angle(b, deg=True)

Return the angle of the complex argument

>>> g = np.linspace(0, np.pi, num=5)

Create an array of evenly spaced values (number of samples)

>>> g[3:] += np.pi
>>> np.unwrap(g)

Unwrap

>>> np.logspace(0,10,3)

Create an array of evenly spaced values (log scale)

>>> np.select([c<4], [c*2])

Return values from a list of arrays depending on conditions

>>> from scipy import special
>>> special.factorial(a)

Factorial

>>> special.comb(10, 3, exact=True)

Number of combinations of N things taken k at a time

>>> misc.central_diff_weights(3)

Weights for Np-point central derivative (scipy.misc; removed from recent SciPy releases)

>>> misc.derivative(myfunc, 1.0)

Find the n-th derivative of a function at a point (scipy.misc; removed from recent SciPy releases)


Linear Algebra 


You'll use the linalg and sparse modules. Note that scipy.linalg contains and expands on numpy.linalg.


Also see NumPy


>>> from scipy import linalg, sparse


Creating Matrices


>>> A = np.matrix(np.random.random((2,2)))
>>> B = np.asmatrix(b)

>>> C = np.asmatrix(np.random.random((10,5)))
>>> D = np.asmatrix([[3,4], [5,6]])


Basic Matrix Routines


Inverse

>>> A.I

Inverse

>>> linalg.inv(A)

Inverse

>>> A.T

Transpose matrix

>>> A.H

Conjugate transposition

>>> np.trace(A)

Trace

Norm

>>> linalg.norm(A)

Frobenius norm

>>> linalg.norm(A, 1)

L1 norm (max column sum)

>>> linalg.norm(A, np.inf)

L inf norm (max row sum)

Rank

>>> np.linalg.matrix_rank(C)

Matrix rank

Determinant

>>> linalg.det(A)

Determinant

Solving linear problems

>>> linalg.solve(A, b)

Solver for dense matrices

>>> E = np.asmatrix(a).T

Column matrix built from a

>>> linalg.lstsq(D, E)

Least-squares solution to linear matrix equation

Generalized inverse

>>> linalg.pinv(C)

Compute the pseudo-inverse of a matrix (least-squares solver)

>>> linalg.pinv2(C)

Compute the pseudo-inverse of a matrix (SVD); removed from recent SciPy releases, use linalg.pinv


Creating Sparse Matrices


>>> F = np.eye(3, k=1)

Create a 3x3 matrix with ones on the first superdiagonal

>>> G = np.asmatrix(np.identity(2))

Create a 2x2 identity matrix

>>> C[C > 0.5] = 0

>>> H = sparse.csr_matrix(C)

Compressed Sparse Row matrix

>>> I = sparse.csc_matrix(D)

Compressed Sparse Column matrix

>>> J = sparse.dok_matrix(A)

Dictionary Of Keys matrix

>>> H.todense()

Sparse matrix to full matrix

>>> sparse.isspmatrix_csc(A)

Identify sparse matrix


Sparse Matrix Routines


Inverse

>>> sparse.linalg.inv(I)

Inverse

Norm

>>> sparse.linalg.norm(I)

Norm

Solving linear problems

>>> sparse.linalg.spsolve(H, I)

Solver for sparse matrices


Sparse Matrix Functions


>>> sparse.linalg.expm(I)

Sparse matrix exponential


Matrix Functions


Addition

>>> np.add(A,D)

Addition

Subtraction

>>> np.subtract(A,D)

Subtraction

Division

>>> np.divide(A,D)

Division

Multiplication

>>> np.multiply(D,A)

Multiplication

>>> np.dot(A,D)

Dot product

>>> np.vdot(A,D)

Vector dot product

>>> np.inner(A,D)

Inner product

>>> np.outer(A,D)

Outer product

>>> np.tensordot(A,D)

Tensor dot product

>>> np.kron(A,D)

Kronecker product

Exponential Functions

>>> linalg.expm(A)

Matrix exponential

>>> linalg.expm2(A)

Matrix exponential (Taylor series); removed from recent SciPy releases

>>> linalg.expm3(D)

Matrix exponential (eigenvalue decomposition); removed from recent SciPy releases

Logarithm Function

>>> linalg.logm(A)

Matrix logarithm

Trigonometric Functions

>>> linalg.sinm(D)

Matrix sine

>>> linalg.cosm(D)

Matrix cosine

>>> linalg.tanm(A)

Matrix tangent

Hyperbolic Trigonometric Functions

>>> linalg.sinhm(D)

Hyperbolic matrix sine

>>> linalg.coshm(D)

Hyperbolic matrix cosine

>>> linalg.tanhm(A)

Hyperbolic matrix tangent

Matrix Sign Function

>>> linalg.signm(A)

Matrix sign function

Matrix Square Root

>>> linalg.sqrtm(A)

Matrix square root

Arbitrary Functions

>>> linalg.funm(A, lambda x: x*x)

Evaluate matrix function


Decompositions


Eigenvalues and Eigenvectors

>>> la, v = linalg.eig(A)

Solve ordinary or generalized eigenvalue problem for square matrix

>>> l1, l2 = la

Unpack eigenvalues

>>> v[:,0]

First eigenvector

>>> v[:,1]

Second eigenvector

>>> linalg.eigvals(A)

Compute eigenvalues only

Singular Value Decomposition

>>> U,s,Vh = linalg.svd(B)

Singular Value Decomposition (SVD)

>>> M,N = B.shape

>>> Sig = linalg.diagsvd(s,M,N)

Construct sigma matrix in SVD

LU Decomposition

>>> P,L,U = linalg.lu(C)

LU Decomposition


Sparse Matrix Decompositions


>>> la, v = sparse.linalg.eigs(F,1)

Eigenvalues and eigenvectors

>>> sparse.linalg.svds(H, 2)

SVD
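
One way to check an SVD is to rebuild the matrix from its factors. A minimal sketch, assuming a plain ndarray with the same values as b above (rather than np.matrix) so that the @ operator is an ordinary matrix product:

```python
import numpy as np
from scipy import linalg

B = np.array([[1 + 5j, 2j, 3j], [4j, 5j, 6j]])

U, s, Vh = linalg.svd(B)
M, N = B.shape
Sig = linalg.diagsvd(s, M, N)   # M x N matrix with s on the diagonal

reconstructed = U @ Sig @ Vh    # should match B up to rounding error

print(np.allclose(reconstructed, B))
```

linalg.svd returns the singular values as a 1D array, which is why diagsvd is needed before the factors can be multiplied back together.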


Asking For Help 


>>> help(linalg.diagsvd)
>>> np.info(np.matrix)
Pandas


The Pandas library is built on NumPy and provides easy-to-use
data structures and data analysis tools for the Python
programming language.

Use the following import convention:

>>> import pandas as pd


Pandas Data Structures


Series

A one-dimensional labeled array capable of holding any data type

Index   Value
a        3
b       -5
c        7
d        4

>>> s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])

DataFrame

A two-dimensional labeled data structure with columns
of potentially different types

Index   Country   Capital     Population
0       Belgium   Brussels    11190846
1       India     New Delhi   1303171035
2       Brazil    Brasilia    207847528

>>> data = {'Country': ['Belgium', 'India', 'Brazil'],
            'Capital': ['Brussels', 'New Delhi', 'Brasilia'],
            'Population': [11190846, 1303171035, 207847528]}

>>> df = pd.DataFrame(data,
                      columns=['Country', 'Capital', 'Population'])


Asking For Help 


>>> help(pd.Series.loc)


Selection (also see NumPy Arrays)


Getting

>>> s['b']

Get one element

-5

>>> df[1:]

Get subset of a DataFrame

   Country    Capital  Population
1    India  New Delhi  1303171035
2   Brazil   Brasilia   207847528


Selecting, Boolean Indexing & Setting

By Position

>>> df.iloc[0, 0]

Select single value by row & column

'Belgium'

>>> df.iat[0, 0]

'Belgium'

By Label

>>> df.loc[0, 'Country']

Select single value by row & column labels

'Belgium'

>>> df.at[0, 'Country']

'Belgium'

By Label/Position (df.ix was removed in pandas 1.0; use loc or iloc)

>>> df.loc[2]

Select single row of subset of rows

Country          Brazil
Capital        Brasilia
Population    207847528

>>> df.loc[:, 'Capital']

Select a single column of subset of columns

0     Brussels
1    New Delhi
2     Brasilia

>>> df.loc[1, 'Capital']

Select rows and columns

'New Delhi'

Boolean Indexing

>>> s[~(s > 1)]

Series s where value is not >1

>>> s[(s < -1) | (s > 2)]

s where value is <-1 or >2

>>> df[df['Population'] > 1200000000]

Use filter to adjust DataFrame

Setting

>>> s['a'] = 6

Set index a of Series s to 6

Dropping

>>> s.drop(['a', 'c'])

Drop values from rows (axis=0)

>>> df.drop('Country', axis=1)

Drop values from columns (axis=1)
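
The accessors differ in what they return: scalar lookups give a single value, list indexers give a DataFrame, and drop gives a new object. A sketch that rebuilds the df defined earlier so the block runs on its own (the result names are illustrative):

```python
import pandas as pd

data = {'Country': ['Belgium', 'India', 'Brazil'],
        'Capital': ['Brussels', 'New Delhi', 'Brasilia'],
        'Population': [11190846, 1303171035, 207847528]}
df = pd.DataFrame(data, columns=['Country', 'Capital', 'Population'])

by_position = df.iloc[0, 0]        # scalar: row 0, column 0
by_label = df.loc[0, 'Country']    # scalar: row label 0, column 'Country'
fast_scalar = df.at[1, 'Capital']  # .at and .iat are scalar-only accessors
sub_frame = df.iloc[[0], [0]]      # list indexers return a DataFrame

large = df[df['Population'] > 1200000000]     # boolean mask keeps matching rows
without_country = df.drop('Country', axis=1)  # returns a new DataFrame

print(by_position, by_label, fast_scalar, sub_frame.shape)
```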


Sort & Rank 


>>> df.sort_index()

Sort by labels along an axis

>>> df.sort_values(by='Country')

Sort by the values along an axis

>>> df.rank()

Assign ranks to entries


Retrieving Series/DataFrame Information 


Basic Information 


>>> df.shape

(rows,columns)

>>> df.index

Describe index

>>> df.columns

Describe DataFrame columns

>>> df.info()

Info on DataFrame

>>> df.count()

Number of non-NA values


Summary


>>> df.sum()

Sum of values

>>> df.cumsum()

Cumulative sum of values

>>> df.min()/df.max()

Minimum/maximum values

>>> df.idxmin()/df.idxmax()

Minimum/Maximum index value

>>> df.describe()

Summary statistics

>>> df.mean()

Mean of values

>>> df.median()

Median of values


Applying Functions


>>> f = lambda x: x*2

>>> df.apply(f)

Apply function

>>> df.applymap(f)

Apply function element-wise


Data Alignment 

Internal Data Alignment


NA values are introduced in the indices that don’t overlap:


>>> s3 = pd.Series([7, -2, 3], index=['a', 'c', 'd'])

>>> s + s3

a    10.0
b     NaN
c     5.0
d     7.0






Arithmetic Operations with Fill Methods 


You can also do the internal data alignment yourself with 
the help of the fill methods: 
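
The cheat sheet's own fill-method examples are not preserved here, so the following is only a sketch of how the arithmetic methods with a fill_value argument behave, using the Series s and s3 from this section (the fill values are assumptions chosen for illustration):

```python
import pandas as pd

s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])
s3 = pd.Series([7, -2, 3], index=['a', 'c', 'd'])

plain = s + s3                   # 'b' has no partner, so it becomes NaN
added = s.add(s3, fill_value=0)  # the missing 'b' in s3 is treated as 0
subbed = s.sub(s3, fill_value=2) # the missing 'b' in s3 is treated as 2

print(added)
print(subbed)
```

The fill value stands in only for labels missing from one operand; a label missing from both would still produce NaN.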
Scikit-learn


Scikit-learn is an open source Python library that 
implements a range of machine learning, 
preprocessing, cross-validation and visualization 
algorithms using a unified interface. 


A Basic Example 


>>> from sklearn import neighbors, datasets, preprocessing
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.metrics import accuracy_score
>>> iris = datasets.load_iris()

>>> X, y = iris.data[:, :2], iris.target

>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=33)
>>> scaler = preprocessing.StandardScaler().fit(X_train)

>>> X_train = scaler.transform(X_train)

>>> X_test = scaler.transform(X_test)

>>> knn = neighbors.KNeighborsClassifier(n_neighbors=5)

>>> knn.fit(X_train, y_train)

>>> y_pred = knn.predict(X_test)

>>> accuracy_score(y_test, y_pred)


Loading The Data (also see NumPy & Pandas)


Your data needs to be numeric and stored as NumPy arrays or SciPy sparse
matrices. Other types that are convertible to numeric arrays, such as Pandas
DataFrame, are also acceptable.

>>> import numpy as np

>>> X = np.random.random((10,5))

>>> y = np.array(['M', 'M', 'F', 'F', 'M', 'F', 'M', 'M', 'F', 'F'])
>>> X[X < 0.7] = 0



Create Your Model

Supervised Learning Estimators


Linear Regression

>>> from sklearn.linear_model import LinearRegression
>>> lr = LinearRegression()

Support Vector Machines (SVM)

>>> from sklearn.svm import SVC
>>> svc = SVC(kernel='linear')

Naive Bayes

>>> from sklearn.naive_bayes import GaussianNB
>>> gnb = GaussianNB()

KNN

>>> from sklearn import neighbors

>>> knn = neighbors.KNeighborsClassifier(n_neighbors=5)


Unsupervised Learning Estimators


Principal Component Analysis (PCA)

>>> from sklearn.decomposition import PCA
>>> pca = PCA(n_components=0.95)

K Means

>>> from sklearn.cluster import KMeans
>>> k_means = KMeans(n_clusters=3, random_state=0)



Model Fitting

Supervised learning

>>> lr.fit(X, y)

Fit the model to the data

>>> knn.fit(X_train, y_train)

>>> svc.fit(X_train, y_train)

Unsupervised Learning

>>> k_means.fit(X_train)

Fit the model to the data

>>> pca_model = pca.fit_transform(X_train)

Fit to data, then transform it


Prediction


Supervised Estimators

>>> y_pred = svc.predict(np.random.random((2,5)))

Predict labels

>>> y_pred = lr.predict(X_test)

Predict labels

>>> y_pred = knn.predict_proba(X_test)

Estimate probability of a label

Unsupervised Estimators

>>> y_pred = k_means.predict(X_test)

Predict labels in clustering algos



Preprocessing The Data


Standardization

>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler().fit(X_train)
>>> standardized_X = scaler.transform(X_train)
>>> standardized_X_test = scaler.transform(X_test)

Normalization

>>> from sklearn.preprocessing import Normalizer
>>> scaler = Normalizer().fit(X_train)
>>> normalized_X = scaler.transform(X_train)
>>> normalized_X_test = scaler.transform(X_test)

Binarization

>>> from sklearn.preprocessing import Binarizer
>>> binarizer = Binarizer(threshold=0.0).fit(X)
>>> binary_X = binarizer.transform(X)

Encoding Categorical Features

>>> from sklearn.preprocessing import LabelEncoder
>>> enc = LabelEncoder()
>>> y = enc.fit_transform(y)

Imputing Missing Values

>>> from sklearn.impute import SimpleImputer
>>> imp = SimpleImputer(missing_values=0, strategy='mean')
>>> imp.fit_transform(X_train)

Generating Polynomial Features

>>> from sklearn.preprocessing import PolynomialFeatures
>>> poly = PolynomialFeatures(5)
>>> poly.fit_transform(X)


Evaluate Your Model's Performance


Classification Metrics


Accuracy Score

>>> knn.score(X_test, y_test)

Estimator score method

>>> from sklearn.metrics import accuracy_score
>>> accuracy_score(y_test, y_pred)

Metric scoring functions

Classification Report

>>> from sklearn.metrics import classification_report
>>> print(classification_report(y_test, y_pred))

Precision, recall, f1-score and support

Confusion Matrix

>>> from sklearn.metrics import confusion_matrix
>>> print(confusion_matrix(y_test, y_pred))


Regression Metrics


Mean Absolute Error

>>> from sklearn.metrics import mean_absolute_error
>>> y_true = [3, -0.5, 2]

>>> mean_absolute_error(y_true, y_pred)

Mean Squared Error

>>> from sklearn.metrics import mean_squared_error
>>> mean_squared_error(y_test, y_pred)

R2 Score

>>> from sklearn.metrics import r2_score
>>> r2_score(y_true, y_pred)


Clustering Metrics


Adjusted Rand Index

>>> from sklearn.metrics import adjusted_rand_score
>>> adjusted_rand_score(y_true, y_pred)

Homogeneity

>>> from sklearn.metrics import homogeneity_score
>>> homogeneity_score(y_true, y_pred)

V-measure

>>> from sklearn.metrics import v_measure_score
>>> v_measure_score(y_true, y_pred)


Cross-Validation


>>> from sklearn.model_selection import cross_val_score
>>> print(cross_val_score(knn, X_train, y_train, cv=4))
>>> print(cross_val_score(lr, X, y, cv=2))
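
When preprocessing is part of the workflow, one option is to cross-validate a pipeline so the scaler is refit on each training fold. The following is a sketch on the iris data used in the basic example; wrapping the steps with make_pipeline is a choice made here, not something the cheat sheet prescribes:

```python
from sklearn import datasets, neighbors
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target

# Putting the scaler inside the pipeline refits it on each training fold,
# so no statistics leak from the held-out fold.
model = make_pipeline(StandardScaler(),
                      neighbors.KNeighborsClassifier(n_neighbors=5))

scores = cross_val_score(model, X, y, cv=4)   # one accuracy value per fold
print(scores.mean())
```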


Tune Your Model


Grid Search


>>> from sklearn.model_selection import GridSearchCV
>>> params = {"n_neighbors": np.arange(1,3),
              "metric": ["euclidean", "cityblock"]}
>>> grid = GridSearchCV(estimator=knn,
                        param_grid=params)

>>> grid.fit(X_train, y_train)

>>> print(grid.best_score_)

>>> print(grid.best_estimator_.n_neighbors)


Randomized Parameter Optimization


>>> from sklearn.model_selection import RandomizedSearchCV
>>> params = {"n_neighbors": range(1,5),
              "weights": ["uniform", "distance"]}

>>> rsearch = RandomizedSearchCV(estimator=knn,
                                 param_distributions=params,
                                 cv=4,
                                 n_iter=8,
                                 random_state=5)

>>> rsearch.fit(X_train, y_train)

>>> print(rsearch.best_score_)
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Matplotlib 
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Matplotlib 


Matplotlib is a Python 2D plotting library which produces
publication-quality figures in a variety of hardcopy formats
and interactive environments across platforms.


1) Prepare The Data

Also see Lists & NumPy

>>> import numpy as np
>>> x = np.linspace(0, 10, 100)
>>> y = np.cos(x)
>>> z = np.sin(x)

2D Data or Images

>>> data = 2 * np.random.random((10, 10))
>>> data2 = 3 * np.random.random((10, 10))
>>> Y, X = np.mgrid[-3:3:100j, -3:3:100j]
>>> U = -1 - X**2 + Y
>>> V = 1 + X - Y**2
>>> from matplotlib.cbook import get_sample_data
>>> img = np.load(get_sample_data('axes_grid/bivariate_normal.npy'))


2) Create Plot 


>>> import matplotlib.pyplot as plt

>>> fig = plt.figure()
>>> fig2 = plt.figure(figsize=plt.figaspect(2.0))

All plotting is done with respect to an Axes. In most cases, a
subplot will fit your needs. A subplot is an Axes on a grid system.

>>> fig.add_axes()
>>> ax1 = fig.add_subplot(221)  # row-col-num
>>> ax3 = fig.add_subplot(212)
>>> fig3, axes = plt.subplots(nrows=2, ncols=2)
>>> fig4, axes2 = plt.subplots(ncols=3)


3) Plotting Routines

>>> fig, ax = plt.subplots()
>>> lines = ax.plot(x, y)
>>> ax.scatter(x, y)
>>> axes[0,0].bar([1,2,3], [3,4,5])
>>> axes[1,0].barh([0.5,1,2.5], [0,1,2])
>>> axes[1,1].axhline(0.45)
>>> axes[0,1].axvline(0.65)
>>> ax.fill(x, y, color='blue')
>>> ax.fill_between(x, y, color='yellow')

Plot Anatomy & Workflow 


Plot Anatomy

[Figure: anatomy of a plot, labeling the Figure and the Axes/Subplot]

Workflow

The basic steps to creating plots with matplotlib are:

1. Prepare data
2. Create plot
3. Plot
4. Customize plot
5. Save plot
6. Show plot


>>> import matplotlib.pyplot as plt
>>> x = [1,2,3,4]
>>> y = [10,20,25,30]
>>> fig = plt.figure()
>>> ax = fig.add_subplot(111)
>>> ax.plot(x, y, color='lightblue', linewidth=3)
>>> ax.scatter([2,4,6],
               [5,15,25],
               color='darkgreen',
               marker='^')
>>> ax.set_xlim(1, 6.5)
>>> plt.savefig('foo.png')
>>> plt.show()
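The workflow steps can also be run as a script without a display. The sketch below assumes the non-interactive `Agg` backend and writes to a temporary directory; both are choices made for this example and are not part of the sheet.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumed: render off-screen, no window needed
import matplotlib.pyplot as plt

# 1. Prepare data
x = [1, 2, 3, 4]
y = [10, 20, 25, 30]

# 2. Create plot
fig = plt.figure()
ax = fig.add_subplot(111)

# 3. Plot
ax.plot(x, y, color="lightblue", linewidth=3)
ax.scatter([2, 4, 6], [5, 15, 25], color="darkgreen", marker="^")

# 4. Customize plot
ax.set_xlim(1, 6.5)

# 5. Save plot (hypothetical location in a temp directory)
out_path = os.path.join(tempfile.mkdtemp(), "foo.png")
fig.savefig(out_path)

# 6. In an interactive session, plt.show() would display the figure here.
plt.close(fig)
```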


4) Customize Plot

Colors, Color Bars & Color Maps

>>> plt.plot(x, x, x, x**2, x, x**3)
>>> ax.plot(x, y, alpha=0.4)
>>> ax.plot(x, y, c='k')
>>> im = ax.imshow(img,
                   cmap='seismic')
>>> fig.colorbar(im, orientation='horizontal')

Mathtext

>>> plt.title(r'$\sigma_i=15$', fontsize=20)


Markers 


>>> fig, ax = plt.subplots()
>>> ax.scatter(x, y, marker=".")
>>> ax.plot(x, y, marker="o")

Linestyles

>>> plt.plot(x, y, linewidth=4.0)
>>> plt.plot(x, y, ls='solid')
>>> plt.plot(x, y, ls='--')
>>> plt.plot(x, y, '--', x**2, y**2, '-.')
>>> plt.setp(lines, color='r', linewidth=4.0)


Text & Annotations 


>>> ax.text(1,
            -2.1,
            'Example Graph',
            style='italic')
>>> ax.annotate("Sine",
                xy=(8, 0),
                xycoords='data',
                xytext=(10.5, 0),
                textcoords='data',
                arrowprops=dict(arrowstyle="->",
                                connectionstyle="arc3"),)


Limits, Legends & Layouts 


Limits & Autoscaling 


>>> ax.margins(x=0.0, y=0.1)                   # Add padding to a plot
>>> ax.axis('equal')                           # Set the aspect ratio of the plot to 1
>>> ax.set(xlim=[0,10.5], ylim=[-1.5,1.5])     # Set limits for x- and y-axis
>>> ax.set_xlim(0, 10.5)                       # Set limits for x-axis

Legends 


>>> ax.set(title='An Example Axes',            # Set a title and x- and y-axis labels
           ylabel='Y-Axis',
           xlabel='X-Axis')
>>> ax.legend(loc='best')                      # No overlapping plot elements

Ticks 


>>> ax.xaxis.set(ticks=range(1,5),             # Manually set x-ticks
                 ticklabels=[3,100,-12,"foo"])
>>> ax.tick_params(axis='y',                   # Make y-ticks longer and go in and out
                   direction='inout',
                   length=10)


Subplot Spacing 


>>> fig3.subplots_adjust(wspace=0.5,           # Adjust the spacing between subplots
                         hspace=0.3,
                         left=0.125,
                         right=0.9,
                         top=0.9,
                         bottom=0.1)
>>> fig.tight_layout()                         # Fit subplot(s) into the figure area

Axis Spines 


>>> ax1.spines['top'].set_visible(False)                  # Make the top axis line for a plot invisible
>>> ax1.spines['bottom'].set_position(('outward', 10))    # Move the bottom axis line outward


Plotting routine reference (for the 1D calls under Plotting Routines):

ax.plot()            Draw points with lines or markers connecting them
ax.scatter()         Draw unconnected points, scaled or colored
bar()                Plot vertical rectangles (constant width)
barh()               Plot horizontal rectangles (constant height)
axhline()            Draw a horizontal line across axes
axvline()            Draw a vertical line across axes
ax.fill()            Draw filled polygons
ax.fill_between()    Fill between y-values and 0

Vector Fields

>>> axes[0,1].arrow(0, 0, 0.5, 0.5)         # Add an arrow to the axes
>>> axes[1,1].quiver(y, z)                  # Plot a 2D field of arrows
>>> axes[0,1].streamplot(X, Y, U, V)        # Plot streamlines of a 2D vector field
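As a runnable illustration of the vector-field calls, the sketch below rebuilds the `X`, `Y`, `U`, `V` grids from the data-preparation step and draws them with `quiver` and `streamplot`. The `Agg` backend, the side-by-side layout and the subsampling `step` are assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # assumed: render off-screen
import matplotlib.pyplot as plt
import numpy as np

# np.mgrid returns the row coordinate first, hence "Y, X".
Y, X = np.mgrid[-3:3:100j, -3:3:100j]
U = -1 - X**2 + Y   # x-component of the field
V = 1 + X - Y**2    # y-component of the field

fig, (ax_q, ax_s) = plt.subplots(ncols=2, figsize=(8, 4))

# quiver draws one arrow per grid point, so subsample to keep it readable.
step = 10
q = ax_q.quiver(X[::step, ::step], Y[::step, ::step],
                U[::step, ::step], V[::step, ::step])

# streamplot traces streamlines through the full-resolution field.
s = ax_s.streamplot(X, Y, U, V)

plt.close(fig)
```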


Data Distributions 


>>> ax1.hist(y)            # Plot a histogram
>>> ax3.boxplot(y)         # Make a box and whisker plot
>>> ax3.violinplot(z)      # Make a violin plot


5) Save Plot

Save figures

>>> plt.savefig('foo.png')

Save transparent figures

>>> plt.savefig('foo.png', transparent=True)

6) Show Plot

>>> plt.show()



2D Data or Images

>>> fig, ax = plt.subplots()
>>> im = ax.imshow(img,                     # Colormapped or RGB arrays
                   cmap='gist_earth',
                   interpolation='nearest',
                   vmin=-2,
                   vmax=2)
>>> axes2[0].pcolor(data2)                  # Pseudocolor plot of 2D array
>>> axes2[0].pcolormesh(data)               # Pseudocolor plot of 2D array
>>> CS = plt.contour(Y, X, U)               # Plot contours
>>> axes2[2].contourf(data)                 # Plot filled contours
>>> ax.clabel(CS)                           # Label a contour plot

Close & Clear

>>> plt.cla()                               # Clear an axis
>>> plt.clf()                               # Clear the entire figure
>>> plt.close()                             # Close a window
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Statistical Data Visualization With Seaborn 


The Python visualization library Seaborn is based on 
matplotlib and provides a high-level interface for drawing 
attractive statistical graphics. 

Make use of the following aliases to import the libraries: 


>>> import matplotlib.pyplot as plt
>>> import seaborn as sns


The basic steps to creating plots with Seaborn are: 

1. Prepare some data
2. Control figure aesthetics
3. Plot with Seaborn
4. Further customize your plot

>>> import matplotlib.pyplot as plt
>>> import seaborn as sns
>>> tips = sns.load_dataset("tips")
>>> sns.set_style("whitegrid")
>>> g = sns.lmplot(x="tip",
                   y="total_bill",
                   data=tips,
                   aspect=2)
>>> g = (g.set_axis_labels("Tip", "Total bill(USD)")
          .set(xlim=(0,10), ylim=(0,100)))
>>> plt.title("title")
>>> plt.show()
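The same four steps can be tried without downloading the `tips` data set. The sketch below uses a small synthetic DataFrame (made-up values in columns named like the originals), the `Agg` backend, and `ci=None` to skip the bootstrap; all three are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumed: render off-screen
import numpy as np
import pandas as pd
import seaborn as sns

# 1. Prepare some data (synthetic stand-in for the tips data set)
rng = np.random.default_rng(0)
tip = rng.uniform(1, 9, size=50)
df = pd.DataFrame({"tip": tip,
                   "total_bill": 5 * tip + rng.normal(0, 3, size=50)})

# 2. Control figure aesthetics
sns.set_style("whitegrid")

# 3. Plot with Seaborn (lmplot returns a FacetGrid)
g = sns.lmplot(x="tip", y="total_bill", data=df, aspect=2, ci=None)

# 4. Further customize: FacetGrid methods return the grid, so they chain
g = g.set_axis_labels("Tip", "Total bill (USD)").set(xlim=(0, 10),
                                                     ylim=(0, 100))
ax = g.axes[0, 0]  # the single underlying matplotlib Axes
```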


1) Data

Also see Lists, NumPy & Pandas

>>> import pandas as pd
>>> import numpy as np
>>> uniform_data = np.random.rand(10, 12)
>>> data = pd.DataFrame({'x': np.arange(1,101),
                         'y': np.random.normal(0,4,100)})

Seaborn also offers built-in data sets:

>>> titanic = sns.load_dataset("titanic")
>>> iris = sns.load_dataset("iris")


2) Figure Aesthetics

Seaborn styles

>>> sns.set()
>>> sns.set_style("whitegrid")
>>> sns.set_style("ticks",
                  {"xtick.major.size": 8,
                   "ytick.major.size": 8})
>>> sns.axes_style("whitegrid")

3) Plotting With Seaborn

Axis Grids


>>> g = sns.FacetGrid(titanic,              # Subplot grid for plotting conditional relationships
                      col="survived",
                      row="sex")
>>> g = g.map(plt.hist, "age")
>>> sns.catplot(x="pclass",                 # Draw a categorical plot onto a FacetGrid
                y="survived",               # (catplot was named factorplot before seaborn 0.9)
                hue="sex",
                kind="point",
                data=titanic)
>>> sns.lmplot(x="sepal_width",             # Plot data and regression model fits across a FacetGrid
               y="sepal_length",
               hue="species",
               data=iris)



Categorical Plots

Scatterplot

>>> sns.stripplot(x="species",              # Scatterplot with one categorical variable
                  y="petal_length",
                  data=iris)
>>> sns.swarmplot(x="species",              # Categorical scatterplot with non-overlapping points
                  y="petal_length",
                  data=iris)

Bar Chart

>>> sns.barplot(x="sex",                    # Show point estimates and confidence intervals
                y="survived",               # as rectangular bars
                hue="class",
                data=titanic)

Count Plot

>>> sns.countplot(x="deck",                 # Show count of observations
                  data=titanic,
                  palette="Greens_d")

Point Plot

>>> sns.pointplot(x="class",                # Show point estimates and confidence intervals
                  y="survived",             # with scatterplot glyphs
                  hue="sex",
                  data=titanic,
                  palette={"male": "g",
                           "female": "m"},
                  markers=["^", "o"],
                  linestyles=["-", "--"])

Boxplot

>>> sns.boxplot(x="alive",                  # Boxplot
                y="age",
                hue="adult_male",
                data=titanic)
>>> sns.boxplot(data=iris, orient="h")      # Boxplot with wide-form data

Violinplot

>>> sns.violinplot(x="age",                 # Violin plot
                   y="sex",
                   hue="survived",
                   data=titanic)


>>> f, ax = plt.subplots(figsize=(5,6))     # Create a figure and one subplot


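Seaborn styles reference:

sns.set()                       (Re)set the seaborn default
sns.set_style("whitegrid")      Set the matplotlib parameters
sns.set_style("ticks", {...})   Set the matplotlib parameters
sns.axes_style("whitegrid")     Return a dict of params, or use with `with` to temporarily set the style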


Color Palette

>>> sns.set_palette("husl", 3)              # Define the color palette
>>> sns.color_palette("husl")               # Use with `with` to temporarily set palette
>>> flatui = ["#9b59b6", "#3498db", "#95a5a6", "#e74c3c", "#34495e", "#2ecc71"]
>>> sns.set_palette(flatui)                 # Set your own color palette


>>> h = sns.PairGrid(iris)                  # Subplot grid for plotting pairwise relationships
>>> h = h.map(plt.scatter)
>>> sns.pairplot(iris)                      # Plot pairwise bivariate distributions
>>> i = sns.JointGrid(x="x",                # Grid for bivariate plot with marginal univariate plots
                      y="y",
                      data=data)
>>> i = i.plot(sns.regplot,
               sns.distplot)
>>> sns.jointplot(x="sepal_length",         # Plot bivariate distribution
                  y="sepal_width",
                  data=iris,
                  kind='kde')



Regression Plots

>>> sns.regplot(x="sepal_width",            # Plot data and a linear regression model fit
                y="sepal_length",
                data=iris,
                ax=ax)

Distribution Plots

>>> plot = sns.distplot(data.y,             # Plot univariate distribution
                        kde=False,
                        color="b")

Matrix Plots

>>> sns.heatmap(uniform_data, vmin=0, vmax=1)    # Heatmap


4) Further Customizations

Also see Matplotlib

Axisgrid Objects

>>> g.despine(left=True)                    # Remove left spine
>>> g.set_ylabels("Survived")               # Set the labels of the y-axis
>>> g.set_xticklabels(rotation=45)          # Set the tick labels for x
>>> g.set_axis_labels("Survived",           # Set the axis labels
                      "Sex")
>>> h.set(xlim=(0,5),                       # Set the limit and ticks of the x- and y-axis
          ylim=(0,5),
          xticks=[0,2.5,5],
          yticks=[0,2.5,5])

>>> plt.title("A Title")                    # Add plot title
>>> plt.ylabel("Survived")                  # Adjust the label of the y-axis
>>> plt.xlabel("Sex")                       # Adjust the label of the x-axis
>>> plt.ylim(0,100)                         # Adjust the limits of the y-axis
>>> plt.xlim(0,10)                          # Adjust the limits of the x-axis
>>> plt.setp(ax, yticks=[0,5])              # Adjust a plot property
>>> plt.tight_layout()                      # Adjust subplot params


Context Functions

>>> sns.set_context("talk")                 # Set context to "talk"
>>> sns.set_context("notebook",             # Set context to "notebook",
                    font_scale=1.5,         # scale font elements and
                    rc={"lines.linewidth": 2.5})    # override param mapping

5) Show or Save Plot

Also see Matplotlib

>>> plt.show()                              # Show the plot
>>> plt.savefig("foo.png")                  # Save the plot as a figure
>>> plt.savefig("foo.png",                  # Save transparent figure
                transparent=True)



Close & Clear

Also see Matplotlib

>>> plt.cla()                               # Clear an axis
>>> plt.clf()                               # Clear an entire figure
>>> plt.close()                             # Close a window
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Python For Data Science Cheat Sheet 

Bokeh 


Learn Bokeh Interactively at www.DataCamp.com, 
taught by Bryan Van de Ven, core contributor 



Plotting With Bokeh 


The Python interactive visualization library Bokeh 
enables high-performance visual presentation of 
large datasets in modern web browsers. 

Bokeh’s mid-level general purpose bokeh.plotting 
interface is centered around two main components: data 
and glyphs. 


[Figure: data + glyphs = plot]





The basic steps to creating plots with the bokeh.plotting
interface are:

1. Prepare some data:
   Python lists, NumPy arrays, Pandas DataFrames and other sequences of values
2. Create a new plot
3. Add renderers for your data, with visual customizations
4. Specify where to generate the output
5. Show or save the results

>>> from bokeh.plotting import figure
>>> from bokeh.io import output_file, show
>>> x = [1, 2, 3, 4, 5]
>>> y = [6, 7, 2, 4, 5]
>>> p = figure(title="simple line example",
               x_axis_label='x',
               y_axis_label='y')
>>> p.line(x, y, legend="Temp.", line_width=2)
>>> output_file("lines.html")
>>> show(p)
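The five steps above can be sketched without opening a browser by rendering the plot to an HTML string. This version leaves out the legend argument, whose keyword name differs between Bokeh releases, and uses `file_html` instead of `output_file`/`show`; both are choices made for this example. The page title `"lines"` is arbitrary.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# 1. Prepare some data
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# 2. Create a new plot
p = figure(title="simple line example",
           x_axis_label="x",
           y_axis_label="y")

# 3. Add a renderer (a line glyph) with a visual customization
p.line(x, y, line_width=2)

# 4./5. Generate standalone HTML as a string instead of showing it
html = file_html(p, CDN, "lines")
```

The resulting string can be written to a file and opened in any browser.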


1) Data

Also see Lists, NumPy & Pandas

Under the hood, your data is converted to Column Data
Sources. You can also do this manually:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.array([[33.9, 4, 65, 'US'],
                                [32.4, 4, 66, 'Asia'],
                                [21.4, 4, 109, 'Europe']]),
                      columns=['mpg', 'cyl', 'hp', 'origin'],
                      index=['Toyota', 'Fiat', 'Volvo'])
>>> from bokeh.models import ColumnDataSource
>>> cds_df = ColumnDataSource(df)
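One detail of the manual conversion is easy to miss: wrapping mixed numbers and strings in `np.array` turns every value into a string, so columns such as `mpg` stop being numeric. The sketch below contrasts that with passing the rows to `pd.DataFrame` directly; the names `df_str` and `df_num` are illustrative only.

```python
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource

rows = [[33.9, 4, 65, "US"],
        [32.4, 4, 66, "Asia"],
        [21.4, 4, 109, "Europe"]]
cols = ["mpg", "cyl", "hp", "origin"]
idx = ["Toyota", "Fiat", "Volvo"]

# A NumPy array has a single dtype, so the numbers become strings here.
df_str = pd.DataFrame(np.array(rows), columns=cols, index=idx)

# Building from the nested list lets pandas infer a dtype per column.
df_num = pd.DataFrame(rows, columns=cols, index=idx)

# Each DataFrame column (plus the index) becomes a column of the source.
cds_df = ColumnDataSource(df_num)
print(cds_df.column_names)
```

Numeric columns matter as soon as glyph coordinates such as `'mpg'` and `'cyl'` are read from the source.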



2) Plotting

>>> from bokeh.plotting import figure
>>> p1 = figure(plot_width=300, tools='pan,box_zoom')
>>> p2 = figure(plot_width=300, plot_height=300,
                x_range=(0, 8), y_range=(0, 8))
>>> p3 = figure()


3) Renderers & Visual Customizations

Scatter Markers

>>> p1.circle(np.array([1,2,3]), np.array([3,2,1]),
              fill_color='white')
>>> p2.square(np.array([1.5,3.5,5.5]), [1,4,3],
              color='blue', size=1)

Line Glyphs

>>> p1.line([1,2,3,4], [3,4,5,6], line_width=2)
>>> p2.multi_line(pd.DataFrame([[1,2,3], [5,6,7]]),
                  pd.DataFrame([[3,4,5], [3,2,1]]),
                  color="blue")


Customized Glyphs

Also see Data

Selection and Non-Selection Glyphs

>>> p = figure(tools='box_select')
>>> p.circle('mpg', 'cyl', source=cds_df,
             selection_color='red',
             nonselection_alpha=0.1)

Hover Glyphs

>>> from bokeh.models import HoverTool
>>> hover = HoverTool(tooltips=None, mode='vline')
>>> p3.add_tools(hover)

Colormapping

>>> from bokeh.models import CategoricalColorMapper
>>> color_mapper = CategoricalColorMapper(
                       factors=['US', 'Asia', 'Europe'],
                       palette=['blue', 'red', 'green'])
>>> p3.circle('mpg', 'cyl', source=cds_df,
              color=dict(field='origin',
                         transform=color_mapper),
              legend='Origin')


Legend Location

Inside Plot Area

>>> p.legend.location = 'bottom_left'

Outside Plot Area

>>> from bokeh.models import Legend
>>> r1 = p2.asterisk(np.array([1,2,3]), np.array([3,2,1]))
>>> r2 = p2.line([1,2,3,4], [3,4,5,6])
>>> legend = Legend(items=[("One", [r1]), ("Two", [r2])],
                    location=(0, -30))
>>> p2.add_layout(legend, 'right')

Legend Orientation

>>> p.legend.orientation = "horizontal"
>>> p.legend.orientation = "vertical"

Legend Background & Border

>>> p.legend.border_line_color = "navy"
>>> p.legend.background_fill_color = "white"


Rows & Columns Layout

Rows

>>> from bokeh.layouts import row
>>> layout = row(p1, p2, p3)

Columns

>>> from bokeh.layouts import column
>>> layout = column(p1, p2, p3)

Nesting Rows & Columns

>>> layout = row(column(p1, p2), p3)

Grid Layout

>>> from bokeh.layouts import gridplot
>>> row1 = [p1, p2]
>>> row2 = [p3]
>>> layout = gridplot([[p1, p2], [p3]])


Tabbed Layout

>>> from bokeh.models.widgets import Panel, Tabs
>>> tab1 = Panel(child=p1, title="tab1")
>>> tab2 = Panel(child=p2, title="tab2")
>>> layout = Tabs(tabs=[tab1, tab2])

Linked Plots

Linked Axes

>>> p2.x_range = p1.x_range
>>> p2.y_range = p1.y_range

Linked Brushing

>>> p4 = figure(plot_width=100,
                tools='box_select,lasso_select')
>>> p4.circle('mpg', 'cyl', source=cds_df)
>>> p5 = figure(plot_width=200,
                tools='box_select,lasso_select')
>>> p5.circle('mpg', 'hp', source=cds_df)
>>> layout = row(p4, p5)


4) Output & Export

Notebook

>>> from bokeh.io import output_notebook, show
>>> output_notebook()

HTML

Standalone HTML

>>> from bokeh.embed import file_html
>>> from bokeh.resources import CDN
>>> html = file_html(p, CDN, "my_plot")

>>> from bokeh.io import output_file, show
>>> output_file('my_bar_chart.html', mode='cdn')

Components

>>> from bokeh.embed import components
>>> script, div = components(p)

>>> from bokeh.io import export_png
>>> export_png(p, filename="plot.png")

>>> from bokeh.io import export_svgs
>>> p.output_backend = "svg"
>>> export_svgs(p, filename="plot.svg")


5) Show or Save Your Plots

>>> show(p1)
>>> show(layout)
>>> save(p1)
>>> save(layout)
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